On the Characterisation and Clustering of the Blogosphere
Abstract
s – C02        | 13.41% ± 0.26 | 15.89% ± 0.26 | 13.89% ± 0.32 | 10.03% ± 0.27 | 8.64% ± 0.28 | 5.28% ± 0.31  | 6.99% ± 0.26  | 7.09% ± 0.32  | 18.78%
Short news – R8 | 17.71% ± 0.04 | 14.89% ± 0.01 | 17.22% ± 0.02 | 13.56% ± 0.02 | 8.52% ± 0.02 | 7.33% ± 0.01  | 7.04% ± 0.01  | 5.31% ± 0.01  | 8.42%
Weblogs – WbN  | 19.98% ± 0.05 | 18.20% ± 0.03 | 18.69% ± 0.02 | 16.84% ± 0.02 | 8.25% ± 0.01 | 6.23% ± 0.01  | 5.01% ± 0.01  | 3.16% ± 0.01  | 3.64%
Weblogs – WbW  | 18.69% ± 0.01 | 18.91% ± 0.01 | 20.51% ± 0.01 | 17.67% ± 0.01 | 8.6% ± 0.01  | 5.85% ± 0.005 | 4.28% ± 0.006 | 2.37% ± 0.005 | 3.12%
Weblogs – WbLJ | 22.40% ± 0.03 | 21.60% ± 0.04 | 19.60% ± 0.03 | 14.47% ± 0.02 | 6.47% ± 0.02 | 4.37% ± 0.01  | 4.08% ± 0.01  | 2.12% ± 0.01  | 4.89%
Microblogs – Mb | 26.31% ± 0.04 | 13.40% ± 0.02 | 16.19% ± 0.02 | 13.83% ± 0.01 | 8.66% ± 0.02 | 7.73% ± 0.01  | 5.4% ± 0.01   | 3.22% ± 0.01  | 5.26%

Microblogs and weblogs showed consistent percentages, i.e., no large variations across token lengths of four or fewer characters. As expected, we obtained high percentages for short tokens (one to four characters), due among other factors to slang, shortened words, and informal text (Yang, 2011). Based on this information, we can say that if tokens of four or fewer characters make up more than 50% of a corpus, it is highly probable that it is a corpus of microblogs.
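The short-token heuristic described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the whitespace tokenisation, and the bucket layout (lengths 1 to 8 plus one bucket for longer tokens, chosen to mirror the nine value columns whose headers are not visible here) are all assumptions.

```python
from collections import Counter


def token_length_distribution(tokens, max_len=8):
    """Percentage of tokens at each length 1..max_len, plus one final
    bucket for tokens longer than max_len (assumed bucket layout)."""
    # Clamp every length above max_len into the overflow bucket max_len + 1.
    counts = Counter(min(len(t), max_len + 1) for t in tokens if t)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * (max_len + 1)
    return [100.0 * counts.get(k, 0) / total for k in range(1, max_len + 2)]


def short_token_share(tokens, threshold_len=4):
    """Combined percentage of tokens with at most threshold_len characters."""
    return sum(token_length_distribution(tokens)[:threshold_len])


def looks_like_microblog(tokens, cutoff=50.0):
    """Apply the 50% rule of thumb stated in the text."""
    return short_token_share(tokens) > cutoff


# Hypothetical sample text; naive whitespace tokenisation is an assumption.
sample = "omg u r so l8 lol c u 2moro at the gig".split()
dist = token_length_distribution(sample)
print([round(p, 2) for p in dist])
print(round(short_token_share(sample), 2), looks_like_microblog(sample))
```

A real corpus would need a proper tokeniser and a decision on how to treat punctuation, URLs, and hashtags, each of which can shift the length distribution noticeably.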